CERN: Confidence-Energy Recurrent Network for Group Activity Recognition
This work is about recognizing human activities occurring in videos at
distinct semantic levels, including individual actions, interactions, and group
activities. The recognition is realized using a two-level hierarchy of Long
Short-Term Memory (LSTM) networks, forming a feed-forward deep architecture,
which can be trained end-to-end. Relative to existing LSTM architectures, we
make two key contributions, which give our approach its name: the
Confidence-Energy Recurrent Network (CERN). First, instead of using the common
softmax layer for prediction, we specify a novel energy layer (EL) for
estimating the energy of our predictions. Second, rather than finding the
common minimum-energy class assignment, which may be numerically unstable under
uncertainty, we specify that the EL additionally computes the p-values of the
solutions, and in this way estimates the most confident energy minimum. The
evaluation on the Collective Activity and Volleyball datasets demonstrates: (i)
advantages of our two contributions relative to the common softmax and
energy-minimization formulations and (ii) a superior performance relative to
the state-of-the-art approaches.
Comment: Accepted to the IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), 2017
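The "most confident energy minimum" can be illustrated with a toy sketch. Here, instead of simply taking the argmin over class energies, we estimate a p-value for each candidate minimum under random perturbations of the predicted energies and keep the minimum with the smallest p-value. The bootstrap-style perturbation, the noise model, and all names below are illustrative assumptions, not the paper's exact energy-layer formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def confident_energy_minimum(energies, noise_scale=0.1, n_samples=1000):
    """energies: (n_classes,) predicted energies.
    Returns (class_index, p_value) of the most confident energy minimum.
    NOTE: a hypothetical sketch, not CERN's actual energy layer."""
    energies = np.asarray(energies, dtype=float)
    # Perturb the energies to model uncertainty in the predictions.
    samples = energies + rng.normal(
        0.0, noise_scale, size=(n_samples, energies.size))
    # p-value of class c: fraction of perturbed draws where c is NOT the minimum.
    wins = (samples.argmin(axis=1)[:, None]
            == np.arange(energies.size)).mean(axis=0)
    p_values = 1.0 - wins
    best = int(np.argmin(p_values))  # most confident minimum
    return best, float(p_values[best])

# Classes 1 and 2 have nearly equal energy; class 1 is the more stable minimum.
cls, p = confident_energy_minimum([0.9, 0.2, 0.25])
```

The point of the sketch is that when two energy minima are numerically close, the plain argmin is unstable under small perturbations, whereas the p-value ranking prefers the assignment that remains the minimum most consistently.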
SLEDGE: Sequential Labeling of Image Edges for Boundary Detection
Our goal is to detect boundaries of objects or surfaces
occurring in an arbitrary image. We present a new approach
that discovers boundaries by sequential labeling of
a given set of image edges. A visited edge is labeled as
on or off a boundary, based on the edge’s photometric and
geometric properties, and evidence of its perceptual grouping
with already identified boundaries. We use both local
Gestalt cues (e.g., proximity and good continuation), and
the global Helmholtz principle of non-accidental grouping.
A new formulation of the Helmholtz principle is specified
as the entropy of a layout of image edges. For boundary
discovery, we formulate a new policy-iteration algorithm,
called SLEDGE. Training of SLEDGE is iterative. In each
training image, SLEDGE labels a sequence of edges, which
induces loss with respect to the ground truth. These sequences
are then used as training examples for learning
SLEDGE in the next iteration, such that the total loss is
minimized. For extracting image edges that are input to
SLEDGE, we use our new low-level detector. It finds salient
pixel sequences that separate distinct textures within the image.
On the benchmark Berkeley Segmentation Datasets
300 and 500, our approach proves robust and effective. We
outperform the state of the art both in recall and precision
for different input sets of image edges.
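The sequential labeling idea can be sketched as a greedy loop: each visited edge is labeled "on" or "off" a boundary from its own strength plus grouping evidence (proximity and good continuation) with edges already labeled "on". The scoring function, feature weights, and edge representation below are assumptions for illustration; SLEDGE learns its labeling policy iteratively rather than using fixed weights.

```python
import math

def proximity(e1, e2):
    # Gestalt proximity cue: closer endpoints -> stronger grouping.
    (x1, y1), (x2, y2) = e1["end"], e2["start"]
    return math.exp(-math.hypot(x2 - x1, y2 - y1))

def continuation(e1, e2):
    # Good continuation: similar orientations -> stronger grouping.
    return math.cos(e1["theta"] - e2["theta"]) ** 2

def label_edges(edges, w_strength=1.0, w_group=0.5, threshold=0.6):
    """Greedy sequential labeling sketch (hypothetical fixed-weight policy)."""
    boundary, labels = [], []
    for e in edges:
        # Grouping evidence from the strongest already-accepted boundary edge.
        group = max((w_group * proximity(b, e) * continuation(b, e)
                     for b in boundary), default=0.0)
        on = w_strength * e["strength"] + group > threshold
        labels.append(on)
        if on:
            boundary.append(e)
    return labels

edges = [
    {"start": (0, 0), "end": (1, 0), "theta": 0.0, "strength": 0.9},   # strong
    {"start": (1, 0), "end": (2, 0), "theta": 0.0, "strength": 0.3},   # weak, continues edge 0
    {"start": (5, 5), "end": (5, 6), "theta": 1.57, "strength": 0.2},  # weak, isolated
]
labels = label_edges(edges)
```

The second edge is too weak to pass the threshold on its own, but its proximity and collinearity with the first accepted edge push it over; the isolated third edge gets no grouping support and stays "off", which is the behavior the perceptual-grouping cues are meant to produce.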